Speech Emotion Recognition with Data Augmentation and Layer-wise Learning Rate Adjustment
نویسندگان
چکیده
In this work, we design a neural network for recognizing emotions in speech, using the standard IEMOCAP dataset. Following the latest advances in audio analysis, we use an architecture involving both convolutional layers, for extracting highlevel features from raw spectrograms, and recurrent ones for aggregating long-term dependencies. Applying techniques of data augmentation, layerwise learning rate adjustment and batch normalization, we obtain highly competitive results, with 64.5% weighted accuracy and 61.7% unweighted accuracy on four emotions. Moreover, we show that the model performance is strongly correlated with the labeling confidence, which highlights a fundamental difficulty in emotion recognition.
منابع مشابه
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کاملClassification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
متن کاملStructural model of academic adjustment based on learning strategies and learning styles with the mediating role of cognitive emotion regulation strategies in university students
The purpose of the current research was to determine the mediating role of cognitive emotion regulation strategies in the relationship between learning styles and students' academic adjustment. The current research is descriptive-correlational in terms of methodology. The statistical population was all the students of Shahid Bahoner University in Kerman in the academic year of 2021-2022 (12614)...
متن کاملImproving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognizing is selecting of appropriate feature sets in order to improve the detection rate and classification accuracy. In last studies researchers tried to select the appropriate features for classification by using the selecting and reducing the space of features methods, such as the Fisher and PCA. In this research, a hybrid evolutionary algorit...
متن کاملSpeech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1802.05630 شماره
صفحات -
تاریخ انتشار 2018